Efficient Importance Sampling for Binary Contingency Tables
Abstract
Importance sampling has been reported to produce algorithms with excellent empirical performance in counting problems. However, the theoretical support for its efficiency in these applications has been very limited. In this paper, we propose a methodology that can be used to design efficient importance sampling algorithms for counting and to test their efficiency rigorously. We apply our techniques after transforming the problem into a rare-event simulation problem, thereby connecting complexity analysis of counting problems with efficiency in the context of rare-event simulation. As an illustration of our approach, we consider the problem of counting the number of binary tables with fixed column and row sums, $c_j$'s and $r_i$'s, respectively, and total marginal sum $d = \sum_j c_j$. Assuming that $\max_j c_j = o(d^{1/2})$, $\sum_j c_j^2 = O(d)$ and the $r_j$'s are bounded, we show that a suitable importance sampling algorithm, proposed by Chen et al. [J. Amer. Statist. Assoc. 100 (2005) 109–120], requires $O(d^3 \varepsilon^{-2} \delta^{-1})$ operations to produce an estimate that has $\varepsilon$-relative error with probability $1-\delta$. In addition, if $\max_j c_j = o(d^{1/4-\delta_0})$ for some $\delta_0 > 0$, the same coverage can be guaranteed with $O(d^3 \varepsilon^{-2} \log(\delta^{-1}))$ operations.
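The counting idea in the abstract can be illustrated with a minimal sketch. It is not the algorithm of Chen et al.: as a simplifying assumption, each column is filled by choosing its rows uniformly at random among the rows that still have capacity, rather than from a conditional Poisson proposal. The function names `sample_weight` and `estimate_count` are hypothetical. The estimator averages the weight 1/q(X) over sampled tables X, with weight 0 when a draw fails to meet the margins, so its expectation is the number of binary tables with the given row and column sums.

```python
import math
import random


def sample_weight(row_sums, col_sums, rng):
    """Draw one table column by column and return its importance weight.

    Assumed proposal (not the conditional Poisson scheme): for a column
    with sum c, pick c rows uniformly among rows with remaining capacity.
    The weight is 1/q(table), or 0.0 if the draw is not a valid table.
    """
    remaining = list(row_sums)
    weight = 1.0
    for c in col_sums:
        available = [i for i, r in enumerate(remaining) if r > 0]
        if len(available) < c:
            return 0.0  # dead end: this column cannot be completed
        # The column is uniform over C(len(available), c) choices.
        weight *= math.comb(len(available), c)
        for i in rng.sample(available, c):
            remaining[i] -= 1
    # Valid only if every row sum is met exactly.
    return weight if all(r == 0 for r in remaining) else 0.0


def estimate_count(row_sums, col_sums, n_samples=10000, seed=0):
    """Average the importance weights to estimate the number of tables."""
    rng = random.Random(seed)
    total = sum(sample_weight(row_sums, col_sums, rng) for _ in range(n_samples))
    return total / n_samples


if __name__ == "__main__":
    # 3x3 permutation matrices: there are exactly 3! = 6 of them.
    print(estimate_count([1, 1, 1], [1, 1, 1], n_samples=100))
    # Row sums (2, 1), column sums (1, 1, 1): exactly 3 tables.
    print(estimate_count([2, 1], [1, 1, 1], n_samples=20000))
```

With a uniform proposal the weights can vary widely on larger margins, which is the kind of inefficiency a well-chosen proposal distribution is meant to reduce.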
Related resources
Estimating the number of zero-one multi-way tables via sequential importance sampling
In 2005, Chen et al. introduced a sequential importance sampling (SIS) procedure to analyze zero-one two-way tables with given fixed marginal sums (row and column sums) via the conditional Poisson (CP) distribution. They showed that compared with Monte Carlo Markov chain (MCMC)-based approaches, their importance sampling method is more efficient in terms of running time and also provides an eas...
Sequential Monte Carlo Methods for Statistical Analysis of Tables
We describe a sequential importance sampling (SIS) procedure for analyzing two-way zero–one or contingency tables with fixed marginal sums. An essential feature of the new method is that it samples the columns of the table progressively according to certain special distributions. Our method produces Monte Carlo samples that are remarkably close to the uniform distribution, enabling one to appro...
Conditional Inference on Tables with Structural Zeros
We describe a sequential importance sampling approach to making conditional inferences on two-way zero-one and contingency tables with fixed marginal sums and a given set of structural zeros. Our method enables us to approximate closely the null distributions of various test statistics about these tables, as well as to obtain an accurate estimate of the total number of tables satisfying the con...
Lattice Points, Contingency Tables, and Sampling
Markov chains and sequential importance sampling (SIS) are described as two leading sampling methods for Monte Carlo computations in exact conditional inference on discrete data in contingency tables. Examples are explained from genotype data analysis, graphical models, and logistic regression. A new Markov chain and implementation of SIS are described for logistic regression.
Characterizing Optimal Sampling of Binary Contingency Tables via the Configuration Model
A binary contingency table is an m×n array of binary entries with row sums r = (r1, . . . , rm) and column sums c = (c1, . . . , cn). The configuration model generates a contingency table by considering ri tokens of type 1 for each row i and cj tokens of type 2 for each column j, and then taking a uniformly random pairing between type-1 and type-2 tokens. We give a necessary and sufficient cond...
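The token-pairing construction described above can be sketched as follows. This is an illustrative reading of the configuration model, not code from the paper: the names `configuration_model_table` and `is_binary` are hypothetical, and checking whether the paired table is binary is an added assumption, since a random pairing can place more than one pair in the same cell.

```python
import random


def configuration_model_table(row_sums, col_sums, rng):
    """Pair row tokens with column tokens uniformly at random.

    Row i contributes row_sums[i] tokens and column j contributes
    col_sums[j] tokens; entry (i, j) counts the pairs between them.
    """
    if sum(row_sums) != sum(col_sums):
        raise ValueError("row and column sums must have the same total")
    row_tokens = [i for i, r in enumerate(row_sums) for _ in range(r)]
    col_tokens = [j for j, c in enumerate(col_sums) for _ in range(c)]
    # Shuffling one side and pairing position by position yields a
    # uniformly random pairing between the two token sets.
    rng.shuffle(col_tokens)
    table = [[0] * len(col_sums) for _ in row_sums]
    for i, j in zip(row_tokens, col_tokens):
        table[i][j] += 1
    return table


def is_binary(table):
    """Return True if no cell received more than one pair."""
    return all(x <= 1 for row in table for x in row)


if __name__ == "__main__":
    rng = random.Random(0)
    t = configuration_model_table([2, 1, 1], [2, 1, 1], rng)
    print(t, is_binary(t))
```

The margins always match the requested row and column sums by construction; only the binary constraint can fail.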